@dmpots (Contributor) commented Nov 24, 2025

This commit changes how the DWARF expression evaluator handles the deref operation for register and implicit locations on the stack. For a typical memory location, a deref operation reads the value from memory. For register and implicit locations, the deref operation reads the value from the register or the implicit location. lldb eagerly reads register and implicit values and pushes them on the stack, so for these locations the deref becomes a "no-op" that leaves the value on the stack and updates the tracked location kind.

The motivation for this change is to handle DW_OP_deref* operations on location descriptions as described by the heterogeneous debugging extensions.

Specifically, for register locations it states:

> These operations obtain a register location. To fetch the contents of
> a register, it is necessary to use DW_OP_regval_type, use one of the
> DW_OP_breg* register-based addressing operations, or use DW_OP_deref* on
> a register location description.

My understanding is that this is also the intended behavior in DWARF 5, so this is not a change in behavior.
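
The behavior described above can be illustrated with a toy model. This is a minimal sketch under stated assumptions, not lldb's implementation: `ToyEvaluator`, `LocationKind`, and the `Op*` methods are hypothetical names, and memory is modeled as a map from an address to the already-assembled value stored there. It only shows why a deref of a register or implicit location can leave the eagerly read value on the stack and reset the tracked kind, while a deref of a memory location performs a read.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <vector>

// Hypothetical stand-in for the tracked location description kind.
enum class LocationKind { Memory, Register, Implicit };

// Hypothetical toy evaluator; none of these names are lldb APIs.
struct ToyEvaluator {
  std::vector<uint64_t> stack;
  LocationKind kind = LocationKind::Memory;
  uint64_t reg0 = 0;                    // value of the single modeled register
  std::map<uint64_t, uint64_t> memory;  // address -> value stored there

  // DW_OP_reg0: eagerly read the register and remember that the top of the
  // stack describes a register location.
  void OpReg0() {
    stack.push_back(reg0);
    kind = LocationKind::Register;
  }

  // DW_OP_lit<n>: push a small constant.
  void OpLit(uint64_t n) { stack.push_back(n); }

  // DW_OP_stack_value: the top of the stack is the value itself (implicit).
  void OpStackValue() { kind = LocationKind::Implicit; }

  // DW_OP_deref.
  void OpDeref() {
    assert(!stack.empty());
    if (kind == LocationKind::Register || kind == LocationKind::Implicit) {
      // The value was read eagerly, so it is already on top of the stack.
      // Only the tracked kind needs to go back to its default.
      kind = LocationKind::Memory;
      return;
    }
    // Memory location: the top of the stack is an address, so read from it.
    stack.back() = memory.at(stack.back());
  }
};
```

With `reg0 = 0x504`, calling `OpReg0()` then `OpDeref()` leaves `0x504` on the stack, while `OpLit(4)` then `OpDeref()` replaces `4` with whatever the model stores at address 4. This mirrors the contrast drawn by the unit tests added in this PR.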

@llvmbot (Member) commented Nov 24, 2025

@llvm/pr-subscribers-lldb

Author: David Peixotto (dmpots)

Changes

Full diff: https://github.com/llvm/llvm-project/pull/169419.diff

2 Files Affected:

  • (modified) lldb/source/Expression/DWARFExpression.cpp (+42-5)
  • (modified) lldb/unittests/Expression/DWARFExpressionTest.cpp (+104)
diff --git a/lldb/source/Expression/DWARFExpression.cpp b/lldb/source/Expression/DWARFExpression.cpp
index 4f9d6ebf27bf0..3820e1e002ff9 100644
--- a/lldb/source/Expression/DWARFExpression.cpp
+++ b/lldb/source/Expression/DWARFExpression.cpp
@@ -861,13 +861,30 @@ ResolveLoadAddress(ExecutionContext *exe_ctx, lldb::ModuleSP &module_sp,
   return load_addr;
 }
 
-static llvm::Error Evaluate_DW_OP_deref(DWARFExpression::Stack &stack,
-                                        ExecutionContext *exe_ctx,
-                                        lldb::ModuleSP module_sp,
-                                        Process *process) {
+static llvm::Error Evaluate_DW_OP_deref(
+    DWARFExpression::Stack &stack, ExecutionContext *exe_ctx,
+    lldb::ModuleSP module_sp, Process *process,
+    LocationDescriptionKind &dwarf4_location_description_kind) {
   if (stack.empty())
     return llvm::createStringError("expression stack empty for DW_OP_deref");
 
+  // Handle deref of a register or implicit location.
+  // When the current value is a register or implicit location description then
+  // a deref operation should read the value from that location. We eagerly
+  // read the register and implicit values and so its value is already on top of
+  // the stack. We just need to reset the value context and description to their
+  // defaults.
+  if (dwarf4_location_description_kind == Register ||
+      dwarf4_location_description_kind == Implicit) {
+    // Reset context to default values.
+    dwarf4_location_description_kind = Memory;
+    stack.back().ClearContext();
+
+    // The value is already on top of the stack so there is nothing
+    // more to do here.
+    return llvm::Error::success();
+  }
+
   const Value::ValueType value_type = stack.back().GetValueType();
   switch (value_type) {
   case Value::ValueType::HostAddress: {
@@ -1080,7 +1097,8 @@ llvm::Expected<Value> DWARFExpression::Evaluate(
     // target machine.
     case DW_OP_deref: {
       if (llvm::Error err =
-              Evaluate_DW_OP_deref(stack, exe_ctx, module_sp, process))
+              Evaluate_DW_OP_deref(stack, exe_ctx, module_sp, process,
+                                   dwarf4_location_description_kind))
         return err;
     } break;
 
@@ -1106,6 +1124,25 @@ llvm::Expected<Value> DWARFExpression::Evaluate(
         return llvm::createStringError(
             "Invalid address size for DW_OP_deref_size: %d\n", size);
       }
+
+      // Deref a register or implicit location and truncate the value to `size`
+      // bytes. See the corresponding comment in DW_OP_deref for more details on
+      // why we deref these locations this way.
+      if (dwarf4_location_description_kind == Register ||
+          dwarf4_location_description_kind == Implicit) {
+        // Reset context to default values.
+        dwarf4_location_description_kind = Memory;
+        stack.back().ClearContext();
+
+        // Truncate the value on top of the stack to *size* bytes then
+        // extend to the size of an address (e.g. generic type).
+        Scalar scalar = stack.back().GetScalar();
+        scalar.TruncOrExtendTo(size * 8, /*sign=*/false);
+        scalar.TruncOrExtendTo(opcodes.GetAddressByteSize() * 8,
+                               /*sign=*/false);
+        stack.back().GetScalar() = scalar;
+        break;
+      }
       Value::ValueType value_type = stack.back().GetValueType();
       switch (value_type) {
       case Value::ValueType::HostAddress: {
diff --git a/lldb/unittests/Expression/DWARFExpressionTest.cpp b/lldb/unittests/Expression/DWARFExpressionTest.cpp
index e0c2193d27c36..112159e261835 100644
--- a/lldb/unittests/Expression/DWARFExpressionTest.cpp
+++ b/lldb/unittests/Expression/DWARFExpressionTest.cpp
@@ -1216,3 +1216,107 @@ TEST_F(DWARFExpressionMockProcessTestWithAArch, DW_op_deref_no_ptr_fixing) {
   llvm::Expected<Value> result_deref = evaluate_expr(expr_deref);
   EXPECT_THAT_EXPECTED(result_deref, ExpectLoadAddress(expected_value));
 }
+
+TEST_F(DWARFExpressionMockProcessTest, deref_register) {
+  TestContext test_ctx;
+  constexpr uint32_t reg_r0 = 0x504;
+  MockMemory::Map memory = {
+      {{0x004, 4}, {0x1, 0x2, 0x3, 0x4}},
+      {{0x504, 4}, {0xa, 0xb, 0xc, 0xd}},
+      {{0x505, 4}, {0x5, 0x6, 0x7, 0x8}},
+  };
+  ASSERT_TRUE(CreateTestContext(&test_ctx, "i386-pc-linux",
+                                RegisterValue(reg_r0), memory, memory));
+
+  ExecutionContext exe_ctx(test_ctx.process_sp);
+  MockDwarfDelegate delegate = MockDwarfDelegate::Dwarf5();
+  auto Eval = [&](llvm::ArrayRef<uint8_t> expr_data) {
+    ExecutionContext exe_ctx(test_ctx.process_sp);
+    return Evaluate(expr_data, {}, &delegate, &exe_ctx,
+                    test_ctx.reg_ctx_sp.get());
+  };
+
+  // Reads from the register r0.
+  // Sets the context to RegisterInfo so we know this is a register location.
+  EXPECT_THAT_EXPECTED(Eval({DW_OP_reg0}),
+                       ExpectScalar(reg_r0, Value::ContextType::RegisterInfo));
+
+  // Reads from the location(register r0).
+  // Clears the context so we know this is a value not a location.
+  EXPECT_THAT_EXPECTED(Eval({DW_OP_reg0, DW_OP_deref}),
+                       ExpectLoadAddress(reg_r0, Value::ContextType::Invalid));
+
+  // Reads from the location(register r0) and adds the value to the host buffer.
+  // The evaluator should implicitly convert it to a memory location when
+  // added to a composite value and should add the contents of memory[r0]
+  // to the host buffer.
+  EXPECT_THAT_EXPECTED(Eval({DW_OP_reg0, DW_OP_deref, DW_OP_piece, 4}),
+                       ExpectHostAddress({0xa, 0xb, 0xc, 0xd}));
+
+  // Reads from the location(register r0) and truncates the value to one byte.
+  // Clears the context so we know this is a value not a location.
+  EXPECT_THAT_EXPECTED(
+      Eval({DW_OP_reg0, DW_OP_deref_size, 1}),
+      ExpectLoadAddress(reg_r0 & 0xff, Value::ContextType::Invalid));
+
+  // Reads from the location(register r0) and truncates to one byte then adds
+  // the value to the host buffer. The evaluator should implicitly convert it to
+  // a memory location when added to a composite value and should add the
+  // contents of memory[r0 & 0xff] to the host buffer.
+  EXPECT_THAT_EXPECTED(Eval({DW_OP_reg0, DW_OP_deref_size, 1, DW_OP_piece, 4}),
+                       ExpectHostAddress({0x1, 0x2, 0x3, 0x4}));
+
+  // Reads from the register r0 + 1.
+  EXPECT_THAT_EXPECTED(
+      Eval({DW_OP_breg0, 1}),
+      ExpectLoadAddress(reg_r0 + 1, Value::ContextType::Invalid));
+
+  // Reads from address r0 + 1, which contains the bytes [5,6,7,8].
+  EXPECT_THAT_EXPECTED(
+      Eval({DW_OP_breg0, 1, DW_OP_deref}),
+      ExpectLoadAddress(0x08070605, Value::ContextType::Invalid));
+}
+
+TEST_F(DWARFExpressionMockProcessTest, deref_implicit_value) {
+  TestContext test_ctx;
+  MockMemory::Map memory = {
+      {{0x4, 1}, {0x1}},
+      {{0x4, 4}, {0x1, 0x2, 0x3, 0x4}},
+  };
+  ASSERT_TRUE(CreateTestContext(&test_ctx, "i386-pc-linux", {}, memory));
+
+  ExecutionContext exe_ctx(test_ctx.process_sp);
+  MockDwarfDelegate delegate = MockDwarfDelegate::Dwarf5();
+  auto Eval = [&](llvm::ArrayRef<uint8_t> expr_data) {
+    ExecutionContext exe_ctx(test_ctx.process_sp);
+    return Evaluate(expr_data, {}, &delegate, &exe_ctx,
+                    test_ctx.reg_ctx_sp.get());
+  };
+
+  // Creates an implicit location with a value of 4.
+  EXPECT_THAT_EXPECTED(Eval({DW_OP_lit4, DW_OP_stack_value}),
+                       ExpectScalar(0x4));
+
+  // Creates an implicit location with a value of 4. The deref reads the value
+  // out of the location and implicitly converts it to a load address.
+  EXPECT_THAT_EXPECTED(Eval({DW_OP_lit4, DW_OP_stack_value, DW_OP_deref}),
+                       ExpectLoadAddress(0x4));
+
+  // Creates an implicit location with a value of 0x504 (uleb128(0x504) =
+  // 0xa84). The deref reads the low byte out of the location and implicitly
+  // converts it to a load address.
+  EXPECT_THAT_EXPECTED(
+      Eval({DW_OP_constu, 0x84, 0xa, DW_OP_stack_value, DW_OP_deref_size, 1}),
+      ExpectLoadAddress(0x4));
+
+  // The tests below are similar to the ones above, but there is no implicit
+  // location created by a stack_value operation. They are provided here as a
+  // reference to contrast with the above tests.
+  EXPECT_THAT_EXPECTED(Eval({DW_OP_lit4}), ExpectLoadAddress(0x4));
+
+  EXPECT_THAT_EXPECTED(Eval({DW_OP_lit4, DW_OP_deref}),
+                       ExpectLoadAddress(0x04030201));
+
+  EXPECT_THAT_EXPECTED(Eval({DW_OP_lit4, DW_OP_deref_size, 1}),
+                       ExpectLoadAddress(0x01));
+}

@dmpots (Contributor Author) commented Nov 24, 2025

Here are some dwarf-dump examples of the DWARF expressions generated for simple thread-local variables in a HIP kernel compiled with hipcc.

The SGPR33 register is the base register for the frame, and the "idx" variable is stored at address 0x30(sgpr33) in address space 5.

rocm-7.0

0x0000036d:     DW_TAG_variable
                  DW_AT_location	(DW_OP_regx SGPR33, DW_OP_deref_size 0x4, DW_OP_constu 0x30, DW_OP_plus, DW_OP_stack_value, DW_OP_deref_size 0x4, DW_OP_lit5, DW_OP_LLVM_user DW_OP_LLVM_form_aspace_address)
                  DW_AT_name	("idx")
                  DW_AT_decl_file	("hello_world.hip")
                  DW_AT_decl_line	(16)
                  DW_AT_type	(0x0000007f "unsigned int")

rocm-6.4.0

0x00000323:     DW_TAG_variable
                  DW_AT_location	(DW_OP_regx SGPR33, DW_OP_deref_size 0x4, DW_OP_constu 0x5, DW_OP_LLVM_user DW_OP_LLVM_form_aspace_address, DW_OP_constu 0x30, DW_OP_LLVM_user DW_OP_LLVM_offset)
                  DW_AT_name	("idx")
                  DW_AT_decl_file	("hello_world.hip")
                  DW_AT_decl_line	(16)
                  DW_AT_type	(0x0000007f "unsigned int")
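
The arithmetic encoded by the rocm-7.0 expression above can be followed step by step. The sketch below is an illustration only: `AddressSpaceLocation`, `LowBytes`, and `LocateIdx` are hypothetical names, the register value passed in is made up, and the code assumes the semantics proposed in this PR (a deref of a register or implicit location yields the value already read) plus the extension's description of DW_OP_LLVM_form_aspace_address as pairing an address with an address space.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical result type: an address paired with an address-space id.
struct AddressSpaceLocation {
  uint64_t address;
  uint64_t address_space;
};

// Keep the low `n` bytes of `value` (n in 1..8), zero-extended to 64 bits.
uint64_t LowBytes(uint64_t value, unsigned n) {
  assert(n >= 1 && n <= 8);
  return n == 8 ? value : value & ((uint64_t{1} << (n * 8)) - 1);
}

// Follows the rocm-7.0 expression for "idx" given a value for SGPR33.
AddressSpaceLocation LocateIdx(uint64_t sgpr33) {
  // DW_OP_regx SGPR33, DW_OP_deref_size 0x4:
  // the register contents, truncated to 4 bytes.
  uint64_t value = LowBytes(sgpr33, 4);
  // DW_OP_constu 0x30, DW_OP_plus: add the frame offset.
  value += 0x30;
  // DW_OP_stack_value, DW_OP_deref_size 0x4:
  // the implicit value is read back unchanged, truncated to 4 bytes.
  value = LowBytes(value, 4);
  // DW_OP_lit5, DW_OP_LLVM_user DW_OP_LLVM_form_aspace_address:
  // form an address in address space 5.
  return {value, 5};
}
```

For a made-up frame base of `0x1000`, `LocateIdx(0x1000)` gives address `0x1030` in address space 5, which matches the "0x30(sgpr33) in address space 5" description.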

Comment on lines 1127 to 1145

// Deref a register or implicit location and truncate the value to `size`
// bytes. See the corresponding comment in DW_OP_deref for more details on
// why we deref these locations this way.
if (dwarf4_location_description_kind == Register ||
    dwarf4_location_description_kind == Implicit) {
  // Reset context to default values.
  dwarf4_location_description_kind = Memory;
  stack.back().ClearContext();

  // Truncate the value on top of the stack to *size* bytes then
  // extend to the size of an address (e.g. generic type).
  Scalar scalar = stack.back().GetScalar();
  scalar.TruncOrExtendTo(size * 8, /*sign=*/false);
  scalar.TruncOrExtendTo(opcodes.GetAddressByteSize() * 8,
                         /*sign=*/false);
  stack.back().GetScalar() = scalar;
  break;
}

Collaborator

We should reuse the Evaluate_DW_OP_deref function from line 864 and pass in the deref size as a new argument: pass in size for this DW_OP_deref_size, and for the actual DW_OP_deref we can pass in the register size.

Contributor Author

@clayborg Yeah, it's not nice that we have two nearly identical implementations of deref. There are some differences between them, but they may not be important. For example, DW_OP_deref reads a FileAddress from the process after converting it to a load address, but DW_OP_deref_size reads a FileAddress from the target.

Let me create a separate PR to unify the implementations.

Contributor Author

Created the PR in #169587

Contributor Author

@clayborg I rebased this change on top of the refactored code. Please take a look when you get a chance.

// Truncate the value on top of the stack to *size* bytes then
// extend to the size of an address (e.g. generic type).
Scalar scalar = stack.back().GetScalar();
scalar.TruncOrExtendTo(size * 8, /*sign=*/false);

It might be worth also adding the DW_OP_LLVM_offset operator here, so that we can handle registers that are wider than an address in the global address space.

As for the DW_OP_LLVM_bit_offset variant, we might want to add that one later, considering that DW_OP_bit_piece was also never implemented.

Collaborator

Let's wait on DW_OP_LLVM_offset for now until we need it.
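
For reference, the truncate-then-extend step discussed in this thread can be sketched on a plain 64-bit integer. This is an assumption-laden illustration rather than the lldb code: `TruncateThenZeroExtend` is a hypothetical helper that stands in for the two unsigned `TruncOrExtendTo` calls, and it assumes values of at most 8 bytes. It shows why a register wider than an address loses its upper bytes on this path.

```cpp
#include <cassert>
#include <cstdint>

// Keep the low `size` bytes of `value`, then zero-extend (or truncate) the
// result to `addr_size` bytes. Both sizes are in bytes, in the range 1..8.
// Hypothetical helper, not an lldb function.
uint64_t TruncateThenZeroExtend(uint64_t value, unsigned size,
                                unsigned addr_size) {
  assert(size >= 1 && size <= 8);
  assert(addr_size >= 1 && addr_size <= 8);
  // Mask off everything above the low `n` bytes; the upper bits become zero,
  // which is what an unsigned extension produces.
  auto low_bytes = [](uint64_t v, unsigned n) {
    return n == 8 ? v : v & ((uint64_t{1} << (n * 8)) - 1);
  };
  return low_bytes(low_bytes(value, size), addr_size);
}
```

For example, `TruncateThenZeroExtend(0x504, 1, 4)` yields `0x04`, the same value as the `reg_r0 & 0xff` expectation in the `deref_register` test.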

// converts it to a load address.
EXPECT_THAT_EXPECTED(
Eval({DW_OP_constu, 0x84, 0xa, DW_OP_stack_value, DW_OP_deref_size, 1}),
ExpectLoadAddress(0x4));

This doesn't seem right, or I am not understanding how the uleb128 output is represented here.

Shouldn't dereferencing the stack value yield the value itself as a load address (i.e. a value of 0x84)? Also, what does the 0xa byte represent here?

I would expect the expression to look more like this:
DW_OP_constu 0x84 DW_OP_stack_value DW_OP_deref_size 1

Collaborator

We are passing raw bytes in here, so the ULEB128-encoded operand of DW_OP_constu is the byte sequence [0x84, 0x0a].

Contributor Author @dmpots, Dec 2, 2025

@ZaricZoran As Greg mentioned, this is the ULEB128-encoded value: ULEB128(0x504) == [0x84, 0x0a], with the least significant 7-bit group first. So when we take the low byte of 0x504, the result should be 0x4.
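
The encoding in question can be checked with a small standalone encoder and decoder. This is a minimal sketch of unsigned LEB128, with `EncodeULEB128` and `DecodeULEB128` as hypothetical helper names; they are not the LLVM support routines.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Encode `value` as unsigned LEB128: 7 bits per byte, least significant
// group first, with the high bit set on every byte except the last.
std::vector<uint8_t> EncodeULEB128(uint64_t value) {
  std::vector<uint8_t> bytes;
  do {
    uint8_t byte = static_cast<uint8_t>(value & 0x7f);
    value >>= 7;
    if (value != 0)
      byte |= 0x80;  // more groups follow
    bytes.push_back(byte);
  } while (value != 0);
  return bytes;
}

// Decode an unsigned LEB128 value from the start of `bytes`.
uint64_t DecodeULEB128(const std::vector<uint8_t> &bytes) {
  uint64_t value = 0;
  unsigned shift = 0;
  for (uint8_t byte : bytes) {
    assert(shift < 64);  // this sketch only handles values that fit in 64 bits
    value |= static_cast<uint64_t>(byte & 0x7f) << shift;
    shift += 7;
    if ((byte & 0x80) == 0)
      break;  // last group
  }
  return value;
}
```

Here `0x504 & 0x7f` is `0x04`, which becomes `0x84` once the continuation bit is set, and `0x504 >> 7` is `0x0a`. The operand bytes are therefore `[0x84, 0x0a]`, and the low byte of the decoded value `0x504` is `0x04`.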

@dmpots merged commit fae64ad into llvm:main on Dec 2, 2025
10 checks passed
kcloudy0717 pushed a commit to kcloudy0717/llvm-project that referenced this pull request Dec 4, 2025
This commit modifies the dwarf expression evaluator in how we handle the
deref operation for register and implicit locations on the stack. For a
typical memory location a deref operation will read the value from
memory. For register and implicit locations the deref operation will
read the value from the register or its implicit location. In lldb we
eagerly read register and implicit values and push them on the stack so
the deref operation for these becomes a "no-op" that leaves the value on
the stack and updates the tracked location kind.

The motivation for this change is to handle `DW_OP_deref*` operations on
location descriptions as described by the heterogeneous debugging
[extensions](https://rocm.docs.amd.com/projects/llvm-project/en/latest/LLVM/llvm/html/AMDGPUDwarfExtensionsForHeterogeneousDebugging.html#a-2-5-4-4-4-register-location-description-operations).

Specifically, for register locations it states

> These operations obtain a register location. To fetch the contents of
> a register, it is necessary to use DW_OP_regval_type, use one of the
> DW_OP_breg* register-based addressing operations, or use DW_OP_deref* on
> a register location description.

My understanding is that this is the intended behavior from dwarf5 as
well and is not a change in behavior.